plotly, colors, and themes


The graphs below don’t have proper titles, axis labels, legends, etc. Please take care to add these to your own graphs.


Use plotly to make visuals interactive

Rather than spending a lot of time building a fully interactive app, you can use plotly: it makes it incredibly easy to take an existing ggplot object and make it interactive with the ggplotly function. For instance, let’s take one of our old penguins plots and assign it to an object named scatter_plain:

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.1      ✔ purrr   1.0.1 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.3.0      ✔ stringr 1.5.0 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(palmerpenguins)

scatter_plain <- penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_bw()
scatter_plain
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Now, we’re going to load the plotly package and use ggplotly to make this plot interactive:

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
ggplotly(scatter_plain)

Notice that there are several ways you can interact with the plot: you can filter, zoom, and get additional information by hovering over the points with the tooltip. You can also customize what’s displayed in the tooltip in various ways. For instance, I can update the above plot to include text denoting the penguin’s sex. Then when we call ggplotly we can specify what is included in the tooltip when hovering over the points:

scatter_upd <- penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species,
             text = paste("sex:", sex))) +
  geom_point(alpha = 0.5, size = 2) +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_bw()
# Show the custom text (the sex variable) and the penguin's species in the tooltip
ggplotly(scatter_upd, tooltip = c("text", "species"))
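
If you want more than one field in the tooltip, one option is to build the whole label yourself in the text aesthetic and then show only that. The following is a minimal sketch of that idea, not the only way to do it: the object names tooltip_scatter and interactive_scatter are made up for illustration, and it relies on plotly rendering the HTML line break tag inside hover text.

```r
library(ggplot2)
library(plotly)
library(palmerpenguins)

# Build one multi-line label per penguin; "<br>" starts a new line in the hover box
tooltip_scatter <- ggplot(
  penguins,
  aes(x = body_mass_g, y = bill_length_mm, color = species,
      text = paste0("Species: ", species,
                    "<br>Sex: ", sex,
                    "<br>Island: ", island))
) +
  geom_point(alpha = 0.5, size = 2) +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_bw()

# tooltip = "text" shows only the custom label and hides the default x/y values
interactive_scatter <- ggplotly(tooltip_scatter, tooltip = "text")
interactive_scatter
```

Putting everything in one text string gives you full control over the wording and order of the tooltip, at the cost of having to format each field yourself.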

Notes on colors in plots

Three types of color scales to work with:

  1. Qualitative: for distinguishing discrete items that don’t have an order (nominal categorical). Colors should be distinct and equally prominent, with none standing out unless desired for emphasis.
  • Do NOT use a discrete scale on a continuous variable
  2. Sequential: when data values are mapped to lighter or darker shades of a color, e.g., in a choropleth, for an ordered categorical variable or a low-to-high continuous variable
  • Do NOT use a sequential scale on an unordered variable
  3. Divergent: think of it as two sequential scales joined at a natural midpoint. The midpoint could represent 0 (assuming +/- values) or 50% if your data spans the full scale
  • Do NOT use a divergent scale on data without a natural midpoint
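
To make the divergent case concrete, here is a rough sketch that creates a variable with a natural midpoint and colors by it. The flipper_diff column, the penguins_centered and diverging_plot names, and the three colors are my own choices for illustration; the point is only that centering flipper length at its mean gives a meaningful 0 for scale_color_gradient2() to pivot on.

```r
library(ggplot2)
library(palmerpenguins)

# Center flipper length at its mean so that 0 means "average flipper length"
penguins_centered <- penguins
penguins_centered$flipper_diff <- penguins$flipper_length_mm -
  mean(penguins$flipper_length_mm, na.rm = TRUE)

diverging_plot <- ggplot(
  penguins_centered,
  aes(x = body_mass_g, y = bill_length_mm, color = flipper_diff)
) +
  geom_point(alpha = 0.5, size = 2) +
  # Two sequential ramps (orange and blue) that meet at a neutral midpoint of 0
  scale_color_gradient2(low = "darkorange", mid = "grey90", high = "darkblue",
                        midpoint = 0) +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)",
       color = "Flipper length\nminus mean (mm)") +
  theme_bw()
diverging_plot
```

If I had colored by the raw flipper_length_mm instead, there would be no natural midpoint and a sequential scale would be the better choice.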

Options for ggplot2 colors

The default color scheme is pretty bad, to put it bluntly, but ggplot2 has ColorBrewer built in, which makes it easy to customize your color scales. For instance, we can change the palette for the species plot from before:

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  scale_color_brewer(palette = "Set2") +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_bw()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Something you should keep in mind is to pick a color-blind friendly palette. One simple way to do this is by using the ggthemes package which has color-blind friendly palettes included:

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_bw()
## Warning: Removed 2 rows containing missing values (`geom_point()`).
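
The ColorBrewer palettes can also be screened for color-blind friendliness. The brewer scales in ggplot2 draw their colors from the RColorBrewer package, which ships a lookup table of palette properties. The sketch below queries that table directly; loading RColorBrewer yourself is an extra step not used elsewhere in these notes, and the object names are just for illustration.

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette with its type and a colorblind flag
palette_info <- brewer.pal.info

# Keep only qualitative palettes that are flagged as color-blind friendly
cb_qualitative <- palette_info[palette_info$category == "qual" &
                                 palette_info$colorblind, ]
rownames(cb_qualitative)

# Pull the hex codes for a three-color version of Set2 (one per species)
set2_colors <- brewer.pal(n = 3, name = "Set2")
set2_colors
```

Any palette name that survives this filter can be passed to scale_color_brewer(palette = ...) in the same way as "Set2" above.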

In terms of displaying color from low to high, the viridis scales are excellent choices (and are also color-blind friendly!).

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, 
             color = flipper_length_mm)) +
  geom_point(alpha = 0.5, size = 2) +
  scale_color_viridis_c() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)",
       color = "Flipper Length (mm)") +
  theme_bw()
## Warning: Removed 2 rows containing missing values (`geom_point()`).
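
The viridis scales come in several variants, which you select with the option argument. As a small sketch (the choice of "magma", the reversed direction, and the trimmed end point are only illustrative settings, and viridis_magma is a made-up object name):

```r
library(ggplot2)
library(palmerpenguins)

viridis_magma <- ggplot(
  penguins,
  aes(x = body_mass_g, y = bill_length_mm, color = flipper_length_mm)
) +
  geom_point(alpha = 0.5, size = 2) +
  # option picks the color map, direction = -1 reverses it so that high values
  # are dark, and end = 0.9 drops the palette's lightest colors
  scale_color_viridis_c(option = "magma", direction = -1, end = 0.9) +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)",
       color = "Flipper Length (mm)") +
  theme_bw()
viridis_magma
```

Dropping the lightest colors can help on a white background, where very pale points are hard to see.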

Notes on themes

We have not explicitly talked about this throughout the semester, but you have seen various changes to the theme of plots for customization. You will constantly be changing the theme of your plots to optimize the display. Fortunately, there are a number of built-in themes you can use to start with rather than the default theme_gray():

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_gray()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

For instance, you have seen me use theme_bw() many times throughout the semester:

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_bw()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

There are options such as theme_minimal():

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_minimal()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

or theme_classic():

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_classic()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

There are also packages with popular themes, such as the ggthemes package, which includes, for example, theme_economist():

library(ggthemes)
penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_economist()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

and theme_fivethirtyeight() to name a couple:

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)") +
  theme_fivethirtyeight()
## Warning: Removed 2 rows containing missing values (`geom_point()`).

With any theme you have picked, you can then modify specific components directly using the theme() layer. There are many aspects of the plot’s theme to modify, such as my decision below to move the legend to the bottom of the figure, drop the legend title, increase the font size of the y-axis text, and decrease the font size of the x-axis text:

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)",
       title = "Larger penguins tend to have larger bills",
       subtitle = "Positive relationship between mass and length is consistent across species") +
  theme_bw() +
  theme(legend.position = "bottom",
        legend.title = element_blank(),
        axis.text.y = element_text(size = 14),
        axis.text.x = element_text(size = 6))
## Warning: Removed 2 rows containing missing values (`geom_point()`).

If you’re tired of explicitly customizing every plot in the same way all the time, then you should make a custom theme. It’s quite easy to make a custom theme for ggplot2, and of course there are an incredible number of ways to customize your theme. In the code chunk below, I modify the theme_bw() theme using the %+replace% operator to make my new theme, my_theme(), which is stored as a function:

my_theme <- function () {
  # Start from theme_bw() with a base font size of 10
  theme_bw(base_size = 10) %+replace%
    theme(
      panel.background  = element_blank(),
      plot.background = element_rect(fill = "transparent", color = NA), 
      legend.position = "bottom",
      legend.background = element_rect(fill = "transparent", color = NA),
      legend.key = element_rect(fill = "transparent", color = NA),
      axis.ticks = element_blank(),
      panel.grid.major = element_line(color = "grey90", linewidth = 0.3), 
      panel.grid.minor = element_blank(),
      plot.title = element_text(size = 18, 
                                hjust = 0, vjust = 0.5, 
                                face = "bold", 
                                margin = margin(b = 0.2, unit = "cm")),
      plot.subtitle = element_text(size = 12, hjust = 0, 
                                   vjust = 0.5, 
                                   margin = margin(b = 0.2, unit = "cm")),
      plot.caption = element_text(size = 7, hjust = 1,
                                  face = "italic", 
                                  margin = margin(t = 0.1, unit = "cm")),
      axis.text.x = element_text(size = 13),
      axis.text.y = element_text(size = 13)
    )
}

Now I can go ahead and make my plot from before with this theme:

penguins %>% 
  ggplot(aes(x = body_mass_g, y = bill_length_mm, color = species)) +
  geom_point(alpha = 0.5, size = 2) +
  ggthemes::scale_color_colorblind() +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)",
       title = "Larger penguins tend to have larger bills",
       subtitle = "Positive relationship between mass and length is consistent across species") +
  my_theme()
## Warning: Removed 2 rows containing missing values (`geom_point()`).
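
One detail worth knowing about the %+replace% operator used in my_theme(): unlike +, it swaps out each listed element entirely rather than updating only the properties you set. The sketch below compares the two on the text element; the object names added and replaced are just for illustration, and inspecting theme elements with $ is something I am assuming works in your version of ggplot2.

```r
library(ggplot2)

# "+" updates only the properties you specify and keeps the rest
added <- theme_bw(base_size = 10) +
  theme(text = element_text(face = "bold"))

# "%+replace%" replaces the whole element, so unspecified properties become NULL
replaced <- theme_bw(base_size = 10) %+replace%
  theme(text = element_text(face = "bold"))

added$text$size     # the base size from theme_bw() is still there
replaced$text$size  # NULL, because the new element never set a size
```

This is why it is safest to use %+replace% on specific elements such as plot.title or axis.text.x, which can still pick up unspecified properties from their parent elements, and to use + when you only want to tweak one property.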